Specify the task CPU and memory (IT-4056) #83

Open
wants to merge 2 commits into
base: dev

tschaffter
Contributor

Closes https://sagebionetworks.jira.com/browse/IT-4056

Changelog

  • Add a class for specifying valid task CPU and memory pairs.
  • Set container max memory to align with task max memory.

tschaffter self-assigned this Dec 13, 2024
tschaffter marked this pull request as ready for review December 13, 2024 05:06
tschaffter requested review from a team as code owners December 13, 2024 05:06
@tschaffter
Contributor Author

I realize that each task must share its resources with two more containers:

  • ecs-service-connect: no CPU and memory limit visible in AWS Console.
  • aws-guardduty-agent: hard max memory limit set to 0.125 GB.

One solution is to increase the memory made available to each task. The increment is usually 1 GB, which costs $3.20 / month (see Fargate pricing). So for 13 tasks, that's about $40 / month / environment, or a total of $1440 / year.
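
For reference, the arithmetic behind that estimate can be sketched as below. The $3.20 per GB-month price and the 13 tasks come from the comment above; the count of three environments is an inference (the $1440 / year figure matches $40 × 12 months × 3), not something stated in this thread, and `added_memory_cost` is a hypothetical helper name.

```python
def added_memory_cost(tasks, usd_per_gb_month, extra_gb=1.0, environments=1, months=12):
    """Return (monthly cost per environment, total cost over `months`) in USD."""
    monthly_per_env = tasks * extra_gb * usd_per_gb_month
    total = monthly_per_env * environments * months
    return monthly_per_env, total


# 13 tasks, +1 GB each, $3.20 per GB-month (figures quoted in the comment above).
# environments=3 is an assumption inferred from the yearly total.
monthly, yearly = added_memory_cost(tasks=13, usd_per_gb_month=3.20, environments=3)
print(f"${monthly:.2f} / month / environment, ${yearly:.2f} / year")
```

Without rounding, the monthly figure is $41.60 per environment, so the yearly total comes out slightly above the rounded $1440.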

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

memory_limit_mib (Union[int, float, None]) – The amount (in MiB) of memory to present to the container. If your container attempts to exceed the allocated memory, the container is terminated. At least one of memoryLimitMiB and memoryReservationMiB is required for non-Fargate services. Default: - No memory limit.

@BryanFauble

ecs-service-connect: no CPU and memory limit visible in AWS Console.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect-concepts-deploy.html#service-connect-concepts-proxy

These are the numbers they recommend:

For task definitions, you must set the CPU and memory parameters.

  1. We recommend adding an additional 256 CPU units and at least 64 MiB of memory to your task CPU and memory for the Service Connect proxy container. On AWS Fargate, the lowest amount of memory that you can set is 512 MiB of memory. On Amazon EC2, task definition memory is required.

For the service, you set the log configuration in the Service Connect configuration.

  1. If you expect tasks in this service to receive more than 500 requests per second at their peak load, we recommend adding 512 CPU units to your task CPU in this task definition for the Service Connect proxy container.
  2. If you expect to create more than 100 Service Connect services in the namespace or 2000 tasks in total across all Amazon ECS services within the namespace, we recommend adding 128 MiB of memory to your task memory for the Service Connect proxy container.
  3. You should do this in every task definition that is used by all of the Amazon ECS services in the namespace.
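
A minimal sketch of how the recommendations quoted above could be turned into a sizing helper. The thresholds (500 requests per second, 100 services, 2000 tasks) and the amounts (256/512 CPU units, 64/128 MiB) are taken from the quoted text; treating the higher amounts as replacing the baseline rather than adding to it is an assumption, and `ProxyOverhead` and `service_connect_overhead` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyOverhead:
    cpu_units: int
    memory_mib: int


def service_connect_overhead(peak_rps, services_in_namespace, tasks_in_namespace):
    """Extra task CPU and memory to budget for the Service Connect proxy."""
    # Assumption: the 512 CPU units / 128 MiB recommendations replace the
    # 256 CPU units / 64 MiB baseline instead of stacking on top of it.
    cpu = 512 if peak_rps > 500 else 256
    large_namespace = services_in_namespace > 100 or tasks_in_namespace > 2000
    memory = 128 if large_namespace else 64
    return ProxyOverhead(cpu_units=cpu, memory_mib=memory)


# A small namespace with modest traffic stays on the baseline recommendation.
print(service_connect_overhead(peak_rps=50, services_in_namespace=13, tasks_in_namespace=13))
```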

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

I like this idea as the first approach. We can then use CloudWatch metrics and adjust from there if we need to bump up to the next valid memory/CPU configuration.
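
As a rough sketch of that first approach, the application container's limit could be derived by subtracting what the sidecars need from the task memory. The 128 MiB for aws-guardduty-agent assumes the 0.125 GB limit mentioned earlier means 0.125 GiB, the 64 MiB for ecs-service-connect is the minimum from the AWS recommendation quoted above, and the 1024 MiB task size and the function name are purely illustrative.

```python
def app_container_memory_limit_mib(task_memory_mib, sidecar_memory_mib):
    """Memory left for the application container after reserving sidecar memory."""
    reserved = sum(sidecar_memory_mib.values())
    limit = task_memory_mib - reserved
    if limit <= 0:
        raise ValueError(
            f"Task memory ({task_memory_mib} MiB) does not cover the "
            f"sidecars ({reserved} MiB)."
        )
    return limit


# Assumed sidecar budgets, in MiB (see the lead-in for where they come from).
sidecars = {"aws-guardduty-agent": 128, "ecs-service-connect": 64}
print(app_container_memory_limit_mib(1024, sidecars))  # 832
```

CloudWatch metrics would then show whether the remaining headroom is enough or whether the task needs the next valid size.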

from enum import Enum


class FargateCpuMemory(Enum):
Contributor

I'm not a fan of this because these combinations are already maintained by AWS and could change over time. Keeping a copy here would introduce the maintenance burden of keeping it up to date. Instead, how about we query AWS for the valid combinations and, if the user doesn't provide a valid one, throw an exception with a link where the user can look up valid combos in AWS?
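
A minimal sketch of the validation half of this suggestion. How the valid combinations would be fetched from AWS is left open here: the function takes them as an argument instead of assuming a particular lookup call. `InvalidTaskSizeError`, `validate_task_size`, and `TASK_SIZE_DOCS_URL` are hypothetical names, the URL is believed to be the task size section of the ECS task definition parameters page, and the two-row table is only an illustrative subset.

```python
# Assumption: this URL points at the AWS documentation for valid task sizes.
TASK_SIZE_DOCS_URL = (
    "https://docs.aws.amazon.com/AmazonECS/latest/developerguide/"
    "task_definition_parameters.html#task_size"
)


class InvalidTaskSizeError(ValueError):
    """Raised when a task CPU/memory pair is not in the allowed set."""


def validate_task_size(cpu, memory_mib, valid_combinations):
    """Raise InvalidTaskSizeError unless (cpu, memory_mib) is an allowed pair.

    valid_combinations maps CPU units to the set of allowed memory values (MiB).
    """
    allowed = valid_combinations.get(cpu)
    if allowed is None or memory_mib not in allowed:
        raise InvalidTaskSizeError(
            f"cpu={cpu}, memory={memory_mib} MiB is not a valid Fargate task size. "
            f"See {TASK_SIZE_DOCS_URL} for valid combinations."
        )


# Illustrative subset only; the real table would come from the chosen lookup.
example_combinations = {256: {512, 1024, 2048}, 512: {1024, 2048, 3072, 4096}}
validate_task_size(256, 1024, example_combinations)  # passes silently
```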

@zaro0508
Contributor

zaro0508 commented Dec 23, 2024

One solution is to increase the memory made available to each task. The increment is usually 1 GB, which costs $3.20 / month (see Fargate pricing). So for 13 tasks, that's about $40 / month / environment, or a total of $1440 / year.

Alternatively, we could keep the memory allocated to the tasks as defined in this PR, but reduce the memory allocated to the OC container. Unlike the task definition, containers added to a task can have their memory set freely (no fixed increments) as long as the value is not larger than the memory allocated to the task.

I'm not a fan of either of these solutions because both would require managing both task and container memory. I suggest we set only the task CPU and memory and don't set the container memory at all. This would allow all of the containers in an ECS task to share the CPU and memory defined at the task level. This seems like the easiest solution. This article on how ECS memory and CPU settings work helped me understand them, particularly the section on Scenarios for different memory configurations.
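
To illustrate the shape of that proposal, here is a sketch of a task definition written as a plain dictionary using the field names of the ECS task definition format. The family, container name, image, and sizes are hypothetical placeholders, and `containers_with_memory_settings` is a hypothetical helper that only checks that no container sets its own memory.

```python
# Task-level cpu/memory only; the container entry sets neither "memory" nor
# "memoryReservation", so every container in the task draws from the task budget.
task_definition = {
    "family": "example-task",  # hypothetical name
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "512",      # task-level CPU units (illustrative)
    "memory": "1024",  # task-level memory in MiB (illustrative)
    "containerDefinitions": [
        {"name": "app", "image": "example/app:latest", "essential": True},
    ],
}


def containers_with_memory_settings(task_def):
    """Names of containers that set their own memory limit or reservation."""
    return [
        container["name"]
        for container in task_def["containerDefinitions"]
        if "memory" in container or "memoryReservation" in container
    ]


print(containers_with_memory_settings(task_definition))  # []
```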

3 participants